Skip to content

proximity/allocation/direction: consistent output dtype and .name across backends#2728

Merged
brendancol merged 3 commits into
mainfrom
deep-sweep-metadata-proximity-2026-05-29-01
Jun 1, 2026
Merged

proximity/allocation/direction: consistent output dtype and .name across backends#2728
brendancol merged 3 commits into
mainfrom
deep-sweep-metadata-proximity-2026-05-29-01

Conversation

@brendancol
Copy link
Copy Markdown
Contributor

Closes #2723

proximity, allocation, and direction returned backend-dependent output metadata. This makes all four backends agree.

  • dtype: the bounded dask+numpy path declared float64 output meta via da.map_overlap(..., meta=np.array(())), even though the chunk function returns float32 and every other backend/path returns float32. Set the meta dtype to float32.
  • .name: dask backends leaked an internal dask op name (_trim-<hash>, _kdtree_chunk_fn-<hash>, asarray-<hash>) into the result .name, while numpy/cupy return None. Reset .name to None after building the result so all backends match. (xarray ignores a name=None kwarg when the data is a named dask array, so the reset has to happen after construction.)

attrs, coords, and dims were already preserved and stay that way.

Verified end-to-end on all four backends, both bounded and unbounded max_distance, for all three functions.

Test plan:

  • New parametrized regression test asserts declared dtype float32 and .name is None across all four backends, both bounded and unbounded, for proximity/allocation/direction.
  • Full test_proximity.py suite passes (93 tests).

)

The bounded dask+numpy path declared float64 output meta while the chunk
function returns float32, disagreeing with numpy, cupy, dask+cupy, and the
unbounded KDTree path. Set the map_overlap meta dtype to float32.

Dask backends also leaked an internal dask op name (e.g. _trim-<hash>) into
the result .name, where numpy/cupy return None. Reset .name to None in
proximity/allocation/direction so all four backends agree.

Add a parametrized regression test asserting declared dtype float32 and
.name None for all three functions, both bounded and unbounded, on every
backend.
@github-actions github-actions Bot added the performance PR touches performance-sensitive code label May 29, 2026
Copy link
Copy Markdown
Contributor Author

@brendancol brendancol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

PR Review: consistent output dtype and .name across backends

Blockers (must fix before merge)

None.

Suggestions (should fix, not blocking)

None.

Nits (optional improvements)

  • test_proximity.py:299 - the new test asserts result.attrs == test_raster.attrs, but the test_raster fixture has empty attrs, so that line only checks empty == empty. It still guards against attrs being dropped, and the existing checks already cover attr preservation with the same fixture. Fine as-is; noting it only.

What looks good

  • The dtype fix hits the exact root cause. The bounded dask+numpy map_overlap declared float64 meta while the chunk function returns float32. meta=np.array((), dtype=np.float32) lines the declared dtype up with the real data and with the other three backends.
  • The .name = None reset is the right call. xarray ignores a name=None kwarg when the data is a named dask array, so resetting after construction is the only thing that clears the leaked dask op name. The inline comment says why.
  • Changes are surgical: three identical two-line resets plus one meta dtype, nothing else touched.
  • The regression test covers the bounded path (max_distance=10) and the unbounded/KDTree path (np.inf), across all four backends and all three functions. It checks the declared (pre-compute) dtype rather than the post-compute dtype, which is what general_output_checks already covers and what let this bug through.

Checklist

  • Output dtype consistent across backends (float32)
  • .name consistent across backends (None)
  • attrs / coords / dims preserved (unchanged by this PR)
  • Bounded and unbounded dask paths both covered by tests
  • No premature materialization or unnecessary copies
  • No benchmark needed (metadata-only fix)
  • No README/docstring change needed (no API change)

@brendancol brendancol merged commit e031d92 into main Jun 1, 2026
6 of 7 checks passed
brendancol added a commit that referenced this pull request Jun 1, 2026
…9-01

Re-sync after #2722/#2728 merged; keep both this PR's
test_return_annotation_consistency and the sibling tests now on main.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

performance PR touches performance-sensitive code

Projects

None yet

Development

Successfully merging this pull request may close these issues.

proximity/allocation/direction: output dtype and .name differ across backends

1 participant